A Comparison of Methods for Data-Driven Cancer Outlier Discovery, and An Application Scheme to Semisupervised Predictive Biomarker Discovery

Authors

  • Seppo Karrila
  • Julian Hock Ean Lee
  • Greg Tucker-Kellogg
Abstract

A core component in translational cancer research is biomarker discovery using gene expression profiling of clinical tumors. This is often based on cell line experiments, so one population is sampled for inference in another. To mitigate this non-statistical problem, we present a semisupervised workflow that focuses on binary (switch-like, bimodal) informative genes that are likely cancer relevant. Outlier detection is a key enabling technology of the workflow and aids in identifying the focus genes. We compare the outlier detection techniques MOST, LSOSS, COPA, ORT, OS, and the t-test, using a publicly available non-small cell lung cancer (NSCLC) dataset. Removing genes with a Gaussian distribution is computationally efficient and matches MOST particularly well, while COPA and OS also pick prognostically relevant genes in their top ranks. Our stability assessment also favours both MOST and COPA; the latter does not pair well with prefiltering for non-Gaussianity, but can handle data sets lacking non-cancer cases. We provide R code for replicating our approach or extending it.
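As a rough illustration of one of the compared techniques, the sketch below computes a COPA-style score for a single gene: expression values are median-centred and scaled by the median absolute deviation (MAD) over all samples, and a high percentile of the scaled values among cancer samples is taken as the score. This is not the authors' R code. The function names, the choice of the 90th percentile, the unscaled MAD, and the handling of a zero MAD are assumptions made for this example only.

```python
from statistics import median


def mad(values, center):
    """Unscaled median absolute deviation around a given center."""
    return median(abs(v - center) for v in values)


def percentile(values, q):
    """q-th percentile (0-100), interpolating linearly between order statistics."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def copa_score(expression, is_cancer, q=90):
    """COPA-style outlier score for one gene (illustrative sketch).

    expression: expression values, one per sample.
    is_cancer:  booleans of the same length, True for cancer samples.
    q:          percentile of the scaled cancer values used as the score
                (90 is an assumed default, not a value from the paper).
    """
    med = median(expression)       # centre on the median of ALL samples
    scale = mad(expression, med)   # robust scale from ALL samples
    if scale == 0:
        return 0.0                 # assumed convention for constant genes
    scaled = [(x - med) / scale for x in expression]
    cancer = [s for s, flag in zip(scaled, is_cancer) if flag]
    return percentile(cancer, q)


# Hypothetical toy data: 4 normal samples followed by 6 cancer samples.
labels = [False] * 4 + [True] * 6
flat_gene = [5.0, 5.1, 4.9, 5.0, 5.2, 4.8, 5.1, 4.9, 5.0, 5.0]
outlier_gene = [5.0, 5.1, 4.9, 5.0, 5.2, 4.8, 5.1, 4.9, 12.0, 13.0]

flat_score = copa_score(flat_gene, labels)
outlier_score = copa_score(outlier_gene, labels)
```

Because centring and scaling use all samples rather than a separate normal group, the score can still be computed when every sample is labelled as cancer, which is consistent with the remark that COPA can handle data sets lacking non-cancer cases.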


Similar resources

Proteomics Applications in Health: Biomarker and Drug Discovery and Food Industry

Advances in genome sequencing have greatly propelled the understanding of the living world; however, they are insufficient for a full description of a biological system. Proteomics has emerged as another large-scale platform for improving the understanding of biology. Proteomic experiments can be used for different aspects of clinical and health sciences such as food technology, biomark...

Pharmaceutical Advances and Proteomics Researches

Proteomics enables understanding the composition, structure, function and interactions of the entire protein complement of a cell, a tissue, or an organism under exactly defined conditions. Some factors, such as stress or drug effects, will change the protein pattern and cause the presence or absence of a protein or gradual variation in abundance. Changes in the proteome provide a snapshot of the...

Performance comparison of four commercial GE discovery PET/CT scanners: A monte carlo study using GATE

Combined PET/CT scanners now play a major role in medicine for in vivo imaging in oncology, cardiology, neurology, and psychiatry. As the performance of a scanner depends not only on the scintillating material but also on the scanner design, with regard to the advent of newer scanners, there is a need to optimize acquisition protocols as well as to compare scanner ...


Journal title:

Volume 10, Issue

Pages -

Publication date: 2011